ABSTRACT

Data from a distance education project, Integrated 
Science 7 (IS7), were used to compare block designs with analysis of 
covariance (ANCOVA) for their ability to increase statistical power. 
The IS7 program enables students in grades six through eight to study 
science via satellite. A sample of 1,802 students from a pilot of the 
program yielded data for the study. The treatment-by-blocks designs 
were formed using the posttest aptitude scores as the dependent 
variable, the pretest scores as the concomitant or blocking variable, 
and the gender of the subjects as the independent variable. 
Two-block, 5-block, and 10-block designs were compared to 2 ANCOVA 
analyses. With the data used, the 10-block design appeared preferable 
to ANCOVA, but overall results suggest that there is no one optimal 
method. The use of ANCOVA versus blocking is dependent on several 
conditions, as discussed. Eleven tables present data from the 
analyses. Contains 49 references. (SLD) 

INTRODUCTION



One task of educational researchers is to determine the most appropriate statistical 
analysis to apply to their data. A primary goal in experimental design is to select the 
most powerful statistical procedure. A study that lacks statistical power may fail to 
detect effects even when they exist (a Type II error). 
The purpose of this study was to evaluate several statistical designs purported to increase 
power. Specifically, block designs were compared to analysis of covariance designs using data 
from a distance education project, Integrated Science 7 (IS7). 

Influences on Statistical Power 

Statistical power is influenced by several factors: "the size of the sample, the degree of 
variability in the dependent variable, the choice of research design and the method of statistical 
analysis, the significance level chosen by the researcher, and the magnitude of the treatment 
effect" (Porter & Raudenbush, 1987, p. 385). Often, the researcher has limited control over 
these factors. For example, once the population and dependent variables are selected, the 
researcher cannot always control the amount of variability. Additionally, the treatment effect 
is not controlled because it is not part of the design itself. Convention restricts the choice of the 
significance level to .05 or lower. The sample size is dependent upon such variables as cost, 
time restraints, the availability of subjects, and the number of trained observers. However, 
the researcher's choice of experimental design and statistical analysis may also influence 
power. 

In determining what is meant by power, Benton (1990) indicated the following: 

The power of a statistical test is the probability, given the Ho (the null 
hypotheses) is false, of obtaining sample results that will lead to its rejection. . . 
In other words, a powerful test is one that has a high probability of claiming that 
a difference exists when it really does (p. 266). 

He stated that the power of a test is dependent upon several factors including "(a) the size of 
true treatment effects, (b) the sample size, (c) the degree of error variance, and (d) the 
significance level" (p. 2). He explained that one of the first steps in planning an experiment 
should be the consideration of power. Benton further elucidated, "the power of a test is equal to 
one minus beta (where beta is the probability of a Type II error), and is determined by the 
four factors listed previously" (p. 2-3). Therefore, to increase the power of a test, the 
researcher must decrease the probability of a Type II error. "The smaller the Type II error 
(beta) the greater the power, and therefore, the greater the sensitivity of the test in detecting 
statistically significant difference" (Benton, 1990, p. 3). 

To decrease the Type II error, the researcher can enlarge the number of observations 
and/or more precisely control the design of the experiment. Benton (1990) also stated, "The 
two most common procedures for increasing power are 1) to increase the size of the sample, and 
2) to employ an experimental design that provides a more precise estimate of the treatment 
effects and a smaller error term" (p. 3). 
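
The two procedures Benton describes can be illustrated with a short calculation. The following is a minimal sketch, not part of the original study: it assumes a two-group comparison, a normal (z) approximation, and hypothetical values for the effect, error standard deviation, and group size.

```python
from statistics import NormalDist


def two_group_power(effect, error_sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two group means.

    Uses a normal (z) approximation, which is an assumption of this sketch.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two independent group means.
    se_diff = error_sd * (2 / n_per_group) ** 0.5
    shift = effect / se_diff
    # Power = 1 - beta = P(reject H0 | true difference equals `effect`).
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical values chosen only for illustration.
baseline = two_group_power(effect=0.3, error_sd=1.0, n_per_group=50)
larger_sample = two_group_power(effect=0.3, error_sd=1.0, n_per_group=200)
smaller_error = two_group_power(effect=0.3, error_sd=0.7, n_per_group=50)

print(f"baseline:      {baseline:.3f}")
print(f"larger sample: {larger_sample:.3f}")
print(f"smaller error: {smaller_error:.3f}")
```

Under these assumptions, both enlarging the sample and shrinking the error term raise power relative to the baseline, which is the motivation for the blocking and ANCOVA designs compared in this study.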

A major source of error variance in behavioral science research can be attributed to 
individual differences among the subjects. These can be controlled partially by carefully 
selecting and assigning subjects who are similar in their characteristics (Benton, 1990). 
However, as in the IS7 pilot, the researchers cannot always carefully select subjects and must 
rely on statistical measures to reduce error variance. Methods such as blocking and ANCOVA are 
designs that can reduce error variance and improve estimates of treatment effects. 

LITERATURE REVIEW 

A popular experimental design in the social sciences and education involves the use of 
pretests and posttests (Hendrix, Carter, & Hintze, 1978). To add to the informational yield of 
the pretest and posttest experiment, the researcher can incorporate an additional independent or 
assigned variable. If properly structured, the use of such a variable can reduce the unexplained 
variance and increase the design efficiency. Further, as Kennedy and Bush (1985) explained, 
"building in assigned variables may even enable the experimenter to generalize his or her 
experimental findings across all levels of the assigned variable" (p. 349). The use of assigned 
variables is an attempt to reduce background noise or error variance. One specific strategy is 
the identification of a concomitant (or accompanying) variable that can be statistically 
correlated with the dependent variable. The concomitant variable is then used for subsequent 
blocking of observations (Kennedy & Bush, 1985; Lentner, Arnold, & Hinkelmann, 1989). 

By blocking, one is classifying or grouping subjects by their scores on the concomitant 
variable. The primary advantage of blocking is to increase the design efficiency. Kennedy and 
Bush (1985) explained: 

Efficiency is improved when the investigator's blocking efforts result in greater 
homogeneity among measures within the levels of the blocking variable than 
would otherwise occur in a completely randomized arrangement. Because the variance 
among scores is smaller within factor-level combinations (cells), and because the 
estimate of population error variance is based upon within-cell variance, it follows that 
design efficiency can potentially be improved (p. 351). 

In this study, the block designs employed were initially termed by Myers (1972) as 
"treatment-by-blocks." This design is a multifactor approach used by researchers who have a 
concomitant variable available that correlates with the dependent variable. The concomitant 
variable is used to improve the efficiency of the design and increase the chance of documenting 
treatment effects. Another advantage is that this system of analysis allows for the assessment of 
statistical interaction that cannot be done with one-factor or two-factor block designs (Kennedy 
& Bush, 1985). 
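
The efficiency argument can be made concrete with simulated data. The sketch below is an illustration only: the simulated scores, the correlation of .6, the two-level treatment, and the rule of cutting ranked pretest scores into equal-sized blocks are all assumptions, not details taken from the designs cited above.

```python
import random


def simulate(n, rho, seed=0):
    """Return n (treatment, pretest, posttest) rows with correlation rho."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        pre = rng.gauss(0, 1)
        post = rho * pre + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        rows.append((i % 2, pre, post))  # alternate two treatment levels
    return rows


def ms_within(rows, n_blocks):
    """Pooled within-cell mean square for a treatment-by-blocks layout."""
    ranked = sorted(rows, key=lambda row: row[1])  # rank on the pretest
    cells = {}
    for rank, (treatment, _pre, post) in enumerate(ranked):
        block = rank * n_blocks // len(ranked)  # equal-sized blocks
        cells.setdefault((treatment, block), []).append(post)
    ss = 0.0
    for scores in cells.values():
        cell_mean = sum(scores) / len(scores)
        ss += sum((y - cell_mean) ** 2 for y in scores)
    # Error degrees of freedom: subjects minus the number of cells.
    return ss / (len(rows) - len(cells))


rows = simulate(n=1000, rho=0.6)
for k in (1, 2, 5, 10):
    print(f"{k:2d} block(s): MS within = {ms_within(rows, k):.3f}")
```

With one block the layout reduces to a completely randomized design. As blocks are added, scores within each cell become more homogeneous and the within-cell mean square falls, which is the gain in efficiency described by Kennedy and Bush.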

The ANCOVA model offers another technique for reducing error variance and, thus, 
gaining statistical power. With ANCOVA, information is gathered from each subject on a 
concomitant variable. This variable, termed the covariable or covariate, is used to decrease 
error variance within a regression context (Kennedy & Bush, 1985). The dependent variable 
scores are regressed on covariable scores in the ANCOVA analysis. A one-way analysis of 
variance (ANOVA) is used on the resulting residual measures that represent the differences 
between scores expected in the least-squares regression line and the actual dependent variable 
scores. Therefore, the regression model can account for a larger portion of each subject's 
dependent variable score (Kennedy & Bush, 1985). Kennedy and Bush (1985) stated the 
following: 

The salutary aspect of this consequence is that the great bulk of explained 
variability would constitute error variability in a standard one-way ANOVA. In 
ANCOVA, the explained variability is extracted and an analysis of variance is 
performed on the residual variability that is partitioned into two components: 
a) variability due to treatment group's differences, and b) variability which 
cannot be explained by the factor incorporated in the design of the study (error 
variability) (p. 396). 

Therefore, ANCOVA is generally more powerful than an ANOVA. 
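
The gain in power can be seen by computing both F ratios on the same scores. The sketch below uses the standard textbook adjustment of the sums of squares for a single covariate; the two small groups of (covariate, dependent variable) pairs are invented for illustration and are not data from the IS7 pilot.

```python
from statistics import mean


def _sums(pairs):
    """Return (Sxx, Sxy, Syy) about the means of the given (x, y) pairs."""
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxx, sxy, syy


def anova_f(groups):
    """One-way ANOVA F ratio on the dependent variable alone."""
    pooled = [p for g in groups for p in g]
    total_syy = _sums(pooled)[2]
    within_syy = sum(_sums(g)[2] for g in groups)
    df_treat, df_error = len(groups) - 1, len(pooled) - len(groups)
    f_ratio = ((total_syy - within_syy) / df_treat) / (within_syy / df_error)
    return f_ratio, df_treat, df_error


def ancova_f(groups):
    """One-way ANCOVA F ratio with a single covariate x."""
    pooled = [p for g in groups for p in g]
    txx, txy, tyy = _sums(pooled)
    wxx = wxy = wyy = 0.0
    for g in groups:
        sxx, sxy, syy = _sums(g)
        wxx, wxy, wyy = wxx + sxx, wxy + sxy, wyy + syy
    # Remove the variability explained by the covariate.
    ss_error_adj = wyy - wxy ** 2 / wxx
    ss_total_adj = tyy - txy ** 2 / txx
    ss_treat_adj = ss_total_adj - ss_error_adj
    # One extra error degree of freedom is spent on the regression slope.
    df_treat, df_error = len(groups) - 1, len(pooled) - len(groups) - 1
    f_ratio = (ss_treat_adj / df_treat) / (ss_error_adj / df_error)
    return f_ratio, df_treat, df_error


# Hypothetical (pretest, posttest) pairs for two treatment groups.
group_a = [(1, 1.1), (2, 2.0), (3, 2.9), (4, 4.2)]
group_b = [(1, 2.0), (2, 3.1), (3, 3.9), (4, 5.1)]

f_anova, _, df_anova = anova_f([group_a, group_b])
f_ancova, _, df_ancova = ancova_f([group_a, group_b])
print(f"ANOVA  F(1, {df_anova}) = {f_anova:.2f}")
print(f"ANCOVA F(1, {df_ancova}) = {f_ancova:.2f}")
```

In this constructed example the covariate explains most of the within-group variability, so the ANCOVA F ratio is far larger than the ANOVA F ratio even though ANCOVA gives up one error degree of freedom.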

Blair and Sawilowsky (1991) agreed that the introduction of a covariate increases the power 
of an ANOVA test and assists in controlling extraneous variables. Elashoff (1969) concurred, 
noting that "the covariance procedure would reduce possible bias in treatment 
comparisons due to differences in the covariate x and increases precision in the treatment 
comparisons by reducing variability in criterion scores 'due to' variability in the 'covariate,' 
x" (p. 384). 

Review of ANCOVA versus Blocking

A review of the literature indicated conflicting results in the comparison of block 
designs to ANCOVA designs. Some of the literature supported the use of ANCOVA, while other 
literature supported block designs. The method to use in choosing between the two was also 
debated in the literature. 

Cochran (1957) offered five advantages to using ANCOVA: its use can increase the 
precision in randomized experiments, remove effects of confounding variables in observational 
studies, add to the knowledge of the nature of treatment effects, fit regressions in multiple 
classifications, and assist in analyzing data when observations are missing. Greenberg (1953) 
and Gourlay (1953) were also among those who favored the use of ANCOVA. In similar studies, 
they compared ANCOVA to a matched block technique. Both recommended ANCOVA over 
blocking. However, Greenberg added that when treatment groups are less than 10, blocking is 
preferred. 

Keppel (1973) came to different conclusions in that he advocated the use of blocking 
over ANCOVA. Keppel offered the following advantages of blocking over ANCOVA: (1) access 
to the block-by-treatment interaction, (2) no requirement that the variables be linearly related, and (3) 
ease of computation. In his discussion on the use of randomized block designs, post-hoc 
blocking, ANCOVA, and analysis of difference scores, he concluded the following: 

The analysis of covariance can be useful in increasing the precision of an experiment. 
The statistical model underlying its use is highly restrictive and thus not generally 
applicable. On almost every count, blocking is the method of choice and post-hoc 
blocking is a second-best technique to increase precision. The use of covariance should 
be questioned except in the simple clear cases, while the analysis of difference should 
generally be avoided (p. 516). 

Feldt (1958) compared an ANCOVA, an ANOVA of difference scores, and a blocking 
technique that he termed a stratification of a factorial design. He found that the least 
effective procedure was the ANOVA of difference scores. Feldt indicated that the precision of the 
ANCOVA or the factorial approach depends upon the population correlation, ρ, of the concomitant 
variable with the dependent variable. For ρ values < .4, the factorial approach is of equal or 
greater precision than the ANCOVA. For ρ > .6, ANCOVA is the more precise, and for ρ < .2, 
neither the ANCOVA nor the factorial design is more precise than a completely randomized 
design. 
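
Feldt's thresholds, as summarized above, can be written as a simple rule of thumb. The function below is only a restatement of that summary: its name is hypothetical, taking the absolute value of the correlation is an assumption, and no preference is returned for correlations from .4 through .6 because none is stated in the summary.

```python
def feldt_suggestion(rho):
    """Return the design suggested by Feldt's thresholds, as summarized above."""
    size = abs(rho)  # assumption: only the strength of the correlation matters
    if size > 1:
        raise ValueError("a correlation must lie between -1 and 1")
    if size < 0.2:
        return "completely randomized design"
    if size < 0.4:
        return "factorial (blocking) design"
    if size > 0.6:
        return "ANCOVA"
    return "no preference stated"


for rho in (0.1, 0.3, 0.5, 0.7):
    print(f"rho = {rho:.1f}: {feldt_suggestion(rho)}")
```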

However, Maxwell, Delaney, and Dill (1984) argued that using the correlation between 
the dependent variable and concomitant variable when choosing between blocking and ANCOVA is 
incorrect. Based on their Monte Carlo study, Maxwell et al. argued that two other 
factors should be considered: (1) whether there is a linear relationship between 
the concomitant variable and dependent variable and (2) whether the scores are available on the 
concomitant variable for all subjects before subjects are assigned to treatments. If the scores 
are available on the concomitant variable before subject assignment, Maxwell et al. found 
blocking to have more power. If the relationship between the concomitant and dependent 
variables is linear, then ANCOVA is the recommended method. If the relationship is not linear, 
Maxwell et al. recommended a two-way ANOVA or a generalized regression analysis. 

Yet, Bonett (1982) argued in a previous article that "ANCOVA can be used when the 
concomitant variable is not linearly related to the dependent variable assuming the correct form 
of the regression equation is fitted" (p. 38'). Further, Bonett pointed out that blocking is only 
more powerful than ANCOVA when the optimal number of blocks is used. To determine the 
optimal number of blocks, Bonett explained that the correlation of the dependent and 
concomitant variables must be known. However, Bonett explained that ANCOVA does not 
require this information to be known. Bonett stated the following: 

The pooled within class regression coefficient is estimated directly from the sample size. 
To obtain maximum statistical power, the magnitude of the concomitant/dependent 
variable correlation must be known for the block design while the form of the 
relationship must be known for the ANCOVA (p. 37). 

In another study, Wu (1993) compared the power of an ANOVA, an ANCOVA, and 
two-block, four-block, and eight-block designs. A main difference between this study and Wu's 
is that he used a Monte Carlo method to obtain his data, whereas this study used data collected from 
participants in the IS7 program. By using simulated data, Wu was able to compare the designs 
under various levels of treatments, various numbers of subjects, and different correlation 
coefficients. He found that when there is no correlation between the dependent variable and the 
concomitant variable, the one-way ANOVA is the most powerful. The block designs are more 
powerful when the correlation is low, and the ANCOVA is more powerful with high correlations. 
However, he found that with moderate correlation, the block design could be as powerful as or 
more powerful than the ANCOVA when the number of treatments and number of subjects per 
treatment are large. Thus, he recommended a block design when the number of subjects and 
treatments are large, and an ANCOVA when they are not. Power increased in all five procedures 
"as the correlation coefficient, the number of treatments, and the number of subjects per 
treatment increased" (p. 27). 
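
A Monte Carlo comparison of this kind can be sketched briefly. The code below is not Wu's procedure: it is a simplified illustration that assumes two treatment groups, normally distributed scores, a large-sample normal critical value in place of the exact t or F distribution, and hypothetical values for the group size and treatment effect. Only ANOVA and ANCOVA are compared.

```python
import random
from statistics import NormalDist


def z_statistics(rng, n, effect, rho):
    """Simulate one two-group study; return (ANOVA z, ANCOVA z) statistics."""
    x = [[], []]
    y = [[], []]
    for g in (0, 1):
        for _ in range(n):
            xi = rng.gauss(0, 1)
            yi = effect * g + rho * xi + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
            x[g].append(xi)
            y[g].append(yi)
    mx = [sum(v) / n for v in x]
    my = [sum(v) / n for v in y]
    wxx = wxy = wyy = 0.0
    for g in (0, 1):
        for xi, yi in zip(x[g], y[g]):
            wxx += (xi - mx[g]) ** 2
            wxy += (xi - mx[g]) * (yi - my[g])
            wyy += (yi - my[g]) ** 2
    # ANOVA: unadjusted mean difference over its standard error.
    anova = (my[1] - my[0]) / (wyy / (2 * n - 2) * 2 / n) ** 0.5
    # ANCOVA: mean difference adjusted with the pooled within-group slope.
    slope = wxy / wxx
    mse_adj = (wyy - wxy ** 2 / wxx) / (2 * n - 3)
    diff_adj = (my[1] - my[0]) - slope * (mx[1] - mx[0])
    se_adj = (mse_adj * (2 / n + (mx[1] - mx[0]) ** 2 / wxx)) ** 0.5
    return anova, diff_adj / se_adj


def estimate_power(n, effect, rho, reps=500, alpha=0.05, seed=1):
    """Share of simulated studies in which each analysis rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # large-sample critical value
    hits = [0, 0]
    for _ in range(reps):
        for i, z in enumerate(z_statistics(rng, n, effect, rho)):
            hits[i] += abs(z) > z_crit
    return hits[0] / reps, hits[1] / reps


for rho in (0.0, 0.3, 0.7):
    p_anova, p_ancova = estimate_power(n=100, effect=0.3, rho=rho)
    print(f"rho = {rho:.1f}: ANOVA {p_anova:.2f}, ANCOVA {p_ancova:.2f}")
```

In this setup the variance of the dependent variable is held constant, so the estimated power of the ANOVA should stay roughly the same across correlations while the estimated power of the ANCOVA should rise as the correlation increases.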

Problems with OVA Techniques

Researchers have examined the problems with using OVA techniques. The use of OVA 
methods requires that the independent variables be nominally scaled (Prosser, 1990; 
Thompson, 1988). The obvious problem with this is that most variables are scaled higher than 
nominal (Prosser, 1990). Campbell (1989) found two other difficulties related to the use of 
OVA methods. These flaws involve a reduction of power against Type II errors and a distortion of 
the relationships among and distribution of the non-interval predictor variables. Further, 
Campbell presented three problems associated specifically with ANCOVA. She found that the 
assumption of reliable measurement of the control variables is often overlooked, that 
researchers too frequently regard ANCOVA as a "magical" technique for equalizing dissimilar 
groups, and that the critical homogeneity of regression assumption is regularly ignored. 

Malgady and Colon-Malgady (1991) argued that little is gained from using ANCOVA and 
suggested it is better to do a simple comparison of gain scores. They disputed the implication that 
ANCOVA is more reliable than an analysis of gain scores. In a comparison of the two designs, 
they found little advantage in the use of ANCOVA. They indicated that ANCOVA suffers reliability 
problems in that the pre- and post-test reliabilities decrease as their intercorrelation 
increases. Malgady and Colon-Malgady made further suggestions: 

Rather than inviting further calamity, such as failure to satisfy the additional and often 
untenable assumptions of analysis of covariance, researchers might do just as well to 
perform analysis of variance on simple difference scores when their reliability is 
adequate. When it is not, analysis of covariance is not likely to help (p. 807). 

However, the cautions of Malgady and Colon-Malgady (1991) were previously examined 
and refuted by Ware and McLean (1978), McLean (1979), and McLean (1989). Analysis of 
covariance and its use with different experimental designs was investigated by Ware and 
McLean. They explained that ANCOVA is used correctly to increase the accuracy of the design 
by reducing the unexplained, within-cell variance, and incorrectly to reduce differences among 
groups by adjusting dependent variable scores for the concomitant variable. The authors 
determined that when intact groups are used or covariates have low reliabilities, ANCOVA should 
be used with caution. However, the authors also warned against completely discarding the use of 
ANCOVA due to its limitations. An example is provided of a two-group experimental design in 
which groups were "'equal' with respect to pretest means" (p. 18). When ANOVA was 
employed, no significant differences were found. Yet when an ANCOVA was employed, significant 
differences were found because the use of the covariate reduced the unexplained variation within 
the groups and increased the accuracy of the analysis (Ware & McLean, 1978). Further 
investigations indicated that ANCOVA will increase the precision of an analysis if there is a 
significant relationship between the covariate and dependent variable. However, ANCOVA will 
not adjust for pre-existing, among-group differences (McLean, 1979; McLean, 1989). 

Assumption Violations of ANCOVA

The effect of assumption violations with ANCOVA has been discussed in the literature. 
Bennett (1983) found that if the sample sizes are equal, the power of ANCOVA would not be 
affected in the presence of heterogeneity of variance. Hamilton (1977) found that alpha levels 
can be maintained in the presence of heterogeneity of variance, but only when sample sizes are 
equal. However, Carver (1976) reported contrasting results. He found that a variation in 
power would be discovered, regardless of sample size, depending on the degree of heterogeneity. 
Further, Hollingsworth (1980) found that despite the size of the sample or the degree of the 
heterogeneity, heterogeneity of regression would affect the levels of alpha and power. 

McLean (1979) also investigated assumptions associated with ANCOVA. He examined the 
importance of each assumption, illustrated methods for testing assumptions, and gave 
suggestions for alternative analyses when assumptions are not met. The assumptions explored 
by McLean are as follows: 

(1) that the cases are assigned randomly to treatment conditions, 

(2) that the covariate is independent of the treatment effect, 

(3) that the covariate is measured without error (i.e., with perfect reliability), 

(4) that the covariate is linearly related to the dependent variable, 

(5) that the regression of the dependent variable on the covariate is the same for each 
group, 

(6) that for each level of the covariate, the dependent variable is normally distributed, 
and 

(7) that the variance of the dependent variable at the given value of the covariate is 
constant across treatment groups and is independent of the covariate (p. 3-4). 
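
Assumption (5), homogeneity of regression, can be examined by comparing a model with one pooled slope against a model with a separate slope for each group. The sketch below computes the usual F ratio for that comparison; it is not taken from McLean's paper, and the function name and the two groups of (covariate, dependent variable) pairs are hypothetical.

```python
from statistics import mean


def slope_homogeneity_f(groups):
    """Return per-group slopes and the F ratio testing equal slopes."""
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    wxx = wxy = wyy = 0.0
    sse_separate = 0.0
    slopes = []
    for g in groups:
        mx = mean(x for x, _ in g)
        my = mean(y for _, y in g)
        sxx = sum((x - mx) ** 2 for x, _ in g)
        sxy = sum((x - mx) * (y - my) for x, y in g)
        syy = sum((y - my) ** 2 for _, y in g)
        slopes.append(sxy / sxx)
        # Residual sum of squares around this group's own regression line.
        sse_separate += syy - sxy ** 2 / sxx
        wxx, wxy, wyy = wxx + sxx, wxy + sxy, wyy + syy
    # Residual sum of squares when all groups share one pooled slope.
    sse_pooled = wyy - wxy ** 2 / wxx
    df_slopes, df_error = n_groups - 1, n_total - 2 * n_groups
    f_ratio = ((sse_pooled - sse_separate) / df_slopes) / (sse_separate / df_error)
    return slopes, f_ratio


# Hypothetical groups whose regression slopes clearly differ.
group_a = [(1, 2.1), (2, 2.9), (3, 4.2), (4, 4.8), (5, 6.1)]
group_b = [(1, 1.0), (2, 3.2), (3, 4.9), (4, 7.1), (5, 8.8)]

slopes, f_ratio = slope_homogeneity_f([group_a, group_b])
print("slopes:", [round(s, 2) for s in slopes])
print(f"F for equal slopes = {f_ratio:.1f}")
```

A large F ratio, as in this constructed example, suggests that a single pooled regression slope does not describe all groups and that the standard ANCOVA model should be used with caution.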

In examining the literature, McLean found that, in most cases, ANCOVA is robust to violations of 
the assumptions of homogeneity of regression and normal distribution. If the assumption that the 
covariate is related to the dependent variable is not met, the ANCOVA is still valid, but no more 
powerful than an ANOVA and may be less powerful due to the loss of a degree of freedom. The 
assumption of homogeneity of variance requires the same amount of consideration with an 
ANCOVA as it does with an ANOVA design. The "most important" assumption to be met, according to 
McLean, is the independence of the covariate and treatment. McLean explained that the 
assumption of perfect reliability is not attainable in the social sciences and becomes less 
important if the covariate and treatment are independent. Random assignment is an 
assumption that is built into the experimental design, and a violation of this will usually 
evidence itself with a violation in one or more of the other assumptions. 

Assumption Violations of Blocking 

A block design is an alternative when the homogeneity of variance assumption is violated. 
Kennedy and Bush (1985) noted that, when the number of subjects per cell is equal, an abundance 
of literature supports the use of block designs. It has been shown that the F tests in 
the block designs are robust against all but extreme violations of the assumption of homogeneity 
of variance (Kennedy & Bush, 1985). 

Block designs can minimize the loss of information by accounting for the effects of 
nuisance factors that characterize the experimental material (Strange, 1990). The expectation 
of the block design is that the scores within the blocks are as homogeneous as possible, and 
scores in different blocks are as heterogeneous as possible. Lentner, Arnold, and Hinkelmann 
(1989) maintained that when these two expectations are met, a block design will yield "better" 
inference with respect to treatment effects than a design without blocking. However, if the 
scores in different blocks are not more heterogeneous than scores within blocks, the inference 
will not be as good as that from a non-block design. 

Fawcett (1990) compared designs to examine the benefit of blocking versus 
the cost of blocking. He compared a Latin square design, a Graeco-Latin square design, and a 
completely randomized design to a randomized complete block design. The disadvantages of 
blocking included a loss in degrees of freedom and a stricter criterion for rejecting the null hypothesis. 
Nevertheless, the gain in the reduction of variability was substantial and Fawcett concluded that 
"the benefits of blocking more than compensated for the cost of blocking" (p. 205). 

Optimal Level of Blocks 

If a block design is to be employed, the question arises as to how many blocks are the 
optimal number. Kennedy and Bush (1985) maintained that this question must be answered in 
regard to the purposes of the research. If the researcher is using the blocking variable because 
of its intrinsic interest, then the subject matter will influence the number of levels of the 
blocking variable, but if the blocking variable is used only to reduce the noise of the nuisance 
factors, the levels of blocks would ideally be the number that maximized efficiency. 

A discussion of the relationships that exist between the number of blocks and design 
efficiency is also presented by Kennedy and Bush (1985) in relation to the 
treatment-by-blocks design. They indicated that the stronger the relationship between the 
dependent variable and concomitant variable, the greater the potential for design efficiency. 
Further, a strong relationship between these variables implies that increasing the number of 
blocks will reduce the average within-cell variance (MS within). Reducing the MS within will 
decrease the noise and increase efficiency. However, for each level of blocks added 
there will be a decrease in the within-cells degrees of freedom by one degree of freedom. The 
effect of this on the design, as well as the effect of the sample size, must be taken into 
consideration. Kennedy and Bush indicated that increasing the sample size will contribute both 
to the efficiency of the design and the power of the F test. Therefore, the determination of the 
number of blocks is influenced by the following: 

a) the correlation between the blocking variable and the dependent variable in the 
population, designated by Rho (ρ), b) the total number of sampling units (N) under 
study, and c) the number of levels associated with the treatment variable. Specific 
implications for determining the number of blocks in a treatment-by-blocks design are: 
a) the greater the magnitude of ρ, the greater the number of blocks, b) the greater the N, 
the greater the number of blocks, and c) the smaller the value of a, the greater the 
number of blocks (p. 372). 

The definition of a is the number of levels that comprise the treatment variable. To apply this 
information, Feldt (1958, cited in Kennedy & Bush, 1985) created a table that enables the 
researcher to determine the optimal level of blocks based upon an integration of these 
relationships. This table is limited, however, to N's of 150 or less. The current study 
contained a sample size of 1,802. 

PROCEDURES 

Since the data used in the study were derived from a pilot of a distance education 
project, Integrated Science 7 (IS7), a brief history of IS7 is provided. Following this, the 
procedures used in the pilot of IS7 are defined. Next, the subjects in the study are defined 
followed by a description of the materials used. Finally, the procedure used to conduct the 
current study is presented. 

Description of IS7 

The IS7 program is broadcast from The University of Alabama's Center for 
Communication and Educational Technology. Dr. W. L. Rainey is the Project Director at the 
center. W. L. Rainey (personal communication, January 13, 1993) explained that The 
University of Alabama in partnership with the State Department of Education, Alabama Public 
Television, and corporate sponsors developed the program titled "Integrated Science 7." The 
ongoing goals of this program include having students (grades six through eight) study science 
via satellite, making the sciences understandable and enjoyable for all students, and presenting 
the sciences in a sequenced, well coordinated, and engaging series that draws from biology, 
chemistry, physics, and earth and space science. The University of Alabama is providing 
personnel and services to be the broadcast site where classes originate and are designed. 
Integrated Science 7 emphasizes direct, hands-on experience and practical applications so that 
students can relate the science to their everyday lives. 

Procedure of IS7 Pilot 

As explained by W. L. Rainey (personal communication, January 13, 1993), the 
1991-92 IS7 pilot program was beamed by satellite or shown on public television from studios 
at The University of Alabama to participating schools in Alabama, Georgia, Mississippi, Florida, 
and Oklahoma. The broadcast was interactive in approximately half of the programs, meaning 
that selected students could converse with the IS7 instructor during the IS7 class. The daily 
broadcast occurred Monday through Friday for 30 minutes and was followed by 30 minutes of 
instruction in the classroom. The broadcast was conducted by a lead teacher and visiting 
scientists who typically introduced a science concept and demonstrated that concept on camera. 
The following half-hour was conducted by the cooperating teachers in participating schools. The 
cooperating teachers also presented IS7 curriculum material, recapped the broadcast, and 
guided students in forming hypotheses, testing them, and drawing conclusions. The 
cooperating teachers were provided with tutors based at The University of Alabama who were 
available by telephone following each broadcast. A computer bulletin board system was provided 
to each participating school so that the cooperating teachers could pose questions and contribute 
commentary on the IS7 pilot (Rainey, 1993). 

Gender was a variable available in the IS7 pilot. Gender was of interest because of the 
demonstrated widespread differences between males and females in science achievement and 
attitudes, with males favored (Steinkamp, 1982). This was also demonstrated by a nationwide 
assessment conducted by the National Assessment of Educational Progress, which revealed 
achievement differences in science favoring males, especially in the physical sciences, for 9-, 
13-, and 17-year-old students (Crawley & Coe, 1990). It has been well documented that 
females are underrepresented in the science fields (Dix, 1987; Olstad, 1981; Reat, 1981). 
Females, when compared to males, often avoid advanced science classes, are less motivated, and 
fail to see the usefulness of such classes (Khoury & Voss, 1985). 

Due to the documented difficulties that students have in science (Nolen & Haladyna, 
1990; O'Malley & Scanlon, 1990; Trefil, 1991), an obvious choice of investigation in the IS7 
pilot was science aptitude. It has been shown that as students progress through school in 
science, achievement levels steadily decline, as do attitudes (Cannon & Simpson, 1985), and 
this is true across grade levels (Simpson & Oliver, 1985). For comparison purposes, the pre- 
and post-test aptitude scale was administered to all the seventh-graders participating in the IS7 
pilot. This comparison was made due to the demonstrated positive relationships between 
attitudes and achievement in science (Gardner, 1975; Omerod & Duckworth, 1975; Ward, 
1976). Research has indicated that science attitude scales can be used to predict science-related 
behavior (Shrigley, 1990). Crawley and Coe (1990) found attitude to be one of the sole 
predictors of whether eighth grade students chose to enroll in an elective high school science 
course. 

Subjects 

The subjects for the study were participants in the IS7 pilot during the 1991-92 school 
year. W. L. Rainey (personal communication, January 13, 1993) indicated that a nationwide 
advertisement was posted in March of 1990 describing the IS7 program and informing schools 
how to participate. By May of 1991, school systems from Alabama, Georgia, Mississippi, 
Florida, and Oklahoma replied and paid fees to receive the IS7 broadcast and materials. Subjects 
for this study included the students in seventh-grade science classes whose schools participated 
in the IS7 pilot program. All IS7 pilot students were used as subjects; however, some data were 
unreadable or not returned and could not be used. The total population size was 2,414. From 
this, subjects were discarded who had not returned all four scores from the pre- and 
post-testing of the attitude and aptitude scales or who had not recorded their gender. This 
yielded a sample size of 1,802 for purposes of this study. 
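
The screening rule described above amounts to keeping only complete cases. A minimal sketch follows; the field names and the three example records are hypothetical and do not reflect the layout of the actual IS7 data files.

```python
# Hypothetical field names for the four scores and gender.
REQUIRED_FIELDS = (
    "pre_aptitude", "post_aptitude", "pre_attitude", "post_attitude", "gender",
)


def complete_cases(records):
    """Keep only records in which every required field is present."""
    return [
        rec for rec in records
        if all(rec.get(field) is not None for field in REQUIRED_FIELDS)
    ]


# Invented records: one complete, one missing a score, one missing gender.
pilot = [
    {"pre_aptitude": 12, "post_aptitude": 15, "pre_attitude": 30,
     "post_attitude": 33, "gender": "F"},
    {"pre_aptitude": 10, "post_aptitude": None, "pre_attitude": 28,
     "post_attitude": 27, "gender": "M"},
    {"pre_aptitude": 14, "post_aptitude": 16, "pre_attitude": 31,
     "post_attitude": 35, "gender": None},
]

sample = complete_cases(pilot)
print(f"{len(pilot)} records received, {len(sample)} retained")
```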

Materials 

A science attitude scale and a science aptitude scale were used for the pre- and post-testing. 
The aptitude scale used was the instrument Processes of Science (Yager, Blunt, & Ajam, 
1990). The attitude scale used was the instrument Attitudes, Preferences, and Understanding 
for Grades 4 through 12 (NEAP, 1980). 

The pretest aptitude and attitude scales were mailed to participating schools from The 
University of Alabama in August of 1991 and were administered in September of 1991. The 
cooperating teachers administered the scales to students and returned the forms to the 
University in September of 1991. Cooperating teachers were mailed the posttest aptitude and 
attitude scales in March of 1992. The cooperating teachers administered these scales in April 
of 1992 and returned the materials to The University of Alabama in April and May of 1992 
(W. L. Rainey, personal communication, January 13, 1993). The data from the pre- and 
post-testing were stored and analyzed using The University of Alabama's IBM 3090-400E 
mainframe. The software packages used included the CMS operating system with SAS version 
6.07 (SAS, 1985). 

Procedure of Current Study 

The treatment-by-blocks designs were formed using the posttest aptitude scores as the 
dependent variable, the pretest aptitude scores as the concomitant or blocking variable, and the 
gender of the subjects as the independent variable. Three levels of blocks were employed: a 
two-block, five-block, and ten-block design. These same designs were replicated, but the 
pretest attitude scores were used as the concomitant variable. The results from these analyses 
were compared to each other and to similar ANCOVA designs. 
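
The block-forming step can be illustrated with a minimal sketch. The source does not state how block boundaries were chosen, so the rule used here (rank subjects on the pretest and cut the ranking into groups of near-equal size) is an assumption, and the function name `assign_blocks` and the sample scores are hypothetical.

```python
from typing import List


def assign_blocks(pretest: List[float], n_blocks: int) -> List[int]:
    """Return a block index (0 = lowest scores) for each subject.

    Subjects are ranked on the pretest and cut into n_blocks groups of
    near-equal size. This cut rule is an assumption; the source does not
    state how block boundaries were chosen.
    """
    n = len(pretest)
    # Rank subjects by pretest score; ties keep their original order.
    order = sorted(range(n), key=lambda i: pretest[i])
    blocks = [0] * n
    for rank, subject in enumerate(order):
        blocks[subject] = rank * n_blocks // n
    return blocks


# Hypothetical pretest scores for ten subjects.
scores = [12, 25, 18, 30, 9, 21, 15, 27, 22, 17]
two = assign_blocks(scores, 2)    # low half versus high half
five = assign_blocks(scores, 5)   # five groups of two subjects each
```

The block index would then enter a two-factor analysis of variance alongside gender, in place of the continuous pretest score that ANCOVA uses.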

Two ANCOVA analyses were also used and compared as to their effectiveness in increasing 
significance and power. As stated, the ANCOVA designs were also compared to the block designs. 
In the first ANCOVA analysis, the dependent variable was the posttest data from the aptitude 
scale, the independent variable was gender, and the covariate was the pretest data from the 
aptitude scale. The variables used in the second analysis were identical, except for the 
covariate, which was changed to the pretest scores from the attitude scale. Therefore, the 
ANCOVA designs made use of the same variables as did the block designs. 

A comparison of the sensitivity of the designs used was made. Sensitivity was defined 
by the resulting p values and power values from the ANCOVA and treatment-by-blocks designs. 

Data Analysis 

A SAS program was written for each analysis and the p values were generated. The SAS 
code used for each analysis is presented in Appendix A. Power was determined using charts 
derived by Pearson and Hartley (1951, cited in Kirk, 1968) that are based on a procedure by 
Tang (1938, cited in Kirk, 1968) for calculating power. Power is calculated by entering a 
parameter in the charts. 
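
The chart parameter is not defined in this section. In standard presentations of the Pearson and Hartley charts it is written as phi. A sketch of its usual equal-n form for a fixed-effects factor with k levels follows; that this is the exact form entered in the present study is an assumption.

```latex
\phi = \sqrt{\frac{n \sum_{j=1}^{k} \alpha_j^{2}}{k\,\sigma_{\varepsilon}^{2}}}
     = \sqrt{\frac{\lambda}{\nu_1 + 1}},
\qquad \nu_1 = k - 1,\quad \nu_2 = \text{error degrees of freedom}
```

Here \alpha_j is the effect of level j, n is the number of subjects per level, \sigma_{\varepsilon}^{2} is the error variance, and \lambda = n \sum_j \alpha_j^{2} / \sigma_{\varepsilon}^{2} is the noncentrality parameter. The charts are read at the chosen significance level with \nu_1 and \nu_2. With unequal group sizes, as with the 830 males and 972 females here, some adjustment to n is needed; the source does not say which was used.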

Research Questions 

To answer the research questions of the present study, an examination was done of the 
resulting values for p and power. The first research question (RQ) addressed whether there are 
differences in p values between the ANCOVA designs and the blocking designs. This comparison is 
represented by the following: 

RQ1: $p_1 = p_2$; $p_1 = p_3$; $p_1 = p_4$; 
     $p_5 = p_6$; $p_5 = p_7$; $p_5 = p_8$ 

The second research question was identical to the first except that it compared power values 
instead of p values. This comparison is represented in the same manner, but substituting power 
values for p values. 

RQ2: $1-\beta_1 = 1-\beta_2$; $1-\beta_1 = 1-\beta_3$; $1-\beta_1 = 1-\beta_4$; 
     $1-\beta_5 = 1-\beta_6$; $1-\beta_5 = 1-\beta_7$; $1-\beta_5 = 1-\beta_8$ 

The third research question was that there is no difference in p values among the block designs, 
and the fourth was that there is no difference in the power values. These are represented as 
follows: 

RQ3: $p_2 = p_3 = p_4$; $p_6 = p_7 = p_8$ 

RQ4: $1-\beta_2 = 1-\beta_3 = 1-\beta_4$; $1-\beta_6 = 1-\beta_7 = 1-\beta_8$ 

The fifth and sixth research questions compared the p and power values, respectively, stating 
that there is no difference between the aptitude and attitude values in the ANCOVA design in 
terms of p and power. They are represented in the succeeding relationships: 

RQ5: $p_1 = p_5$ 

RQ6: $1-\beta_1 = 1-\beta_5$ 

Research question seven asked whether a difference exists between the p values in each of the 
block designs using the aptitude scores as the blocking factor and the p values of each of the 
block designs using the attitude scores as the blocking factor. The last research question 
addressed a comparison of the power values of each of the block designs using aptitude to each of 
the power values of the block designs using attitude. 

RQ7: $p_2 = p_6$; $p_3 = p_7$; $p_4 = p_8$ 

RQ8: $1-\beta_2 = 1-\beta_6$; $1-\beta_3 = 1-\beta_7$; $1-\beta_4 = 1-\beta_8$ 

RESULTS 

The results from the ANCOVA designs are represented in Tables 2 and 3. The degrees of 
freedom (DF), sum of squares (SS), mean squares (MS), F values, and p values are presented. 
Table 2 gives the results from the ANCOVA using the pretest aptitude scores as the covariate, and 
Table 3 shows the results of the ANCOVA using the pretest attitude scores as the covariate. 
Table 2 

ANCOVA Summary Table Using Aptitude as Covariate 

Source               DF          SS          MS    F Value    Pr > F 
Model                 2    27414.94    13707.47     757.67     .0001 
Aptitude              1    27393.22    27393.22    1514.14     .0001 
Gender                1       14.86       14.86        .82     .3649 
Error              1799    32546.73       18.09 
Corrected Total    1801    59961.67 

In this analysis, the overall model's p value was less than .01; therefore, it was significant. 
The r value, which is the correlation between the covariate and the dependent variable, was .46. 
The variable, gender, was not significant (p > .01), indicating that the scores of the males and 
females did not differ significantly. 
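
As a consistency check on the ANCOVA summary table with aptitude as the covariate, the F values can be recomputed from the sums of squares and degrees of freedom. The sketch below uses only values reported in that table; the error mean square is taken as SS/DF (32546.73 / 1799, about 18.09), and the variable names are illustrative.

```python
# Values copied from the ANCOVA summary table with aptitude as covariate.
ss_error, df_error = 32546.73, 1799
effects = {
    "Model": (27414.94, 2),
    "Aptitude": (27393.22, 1),
    "Gender": (14.86, 1),
}

# Error mean square: error sum of squares divided by its degrees of freedom.
ms_error = ss_error / df_error


def f_ratio(ss: float, df: int) -> float:
    """F = (SS_effect / df_effect) / MS_error."""
    return (ss / df) / ms_error


f_values = {name: round(f_ratio(ss, df), 2) for name, (ss, df) in effects.items()}
```

The recomputed ratios agree with the reported F values of 757.67, 1514.14, and .82 to two decimal places.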

Table 3 

ANCOVA Summary Table Using Attitude as Covariate 

Source               DF          SS          MS    F Value    Pr > F 
Model                 2       22.10       11.05        .33     .7178 
Attitude              1         .38         .38        .01     .9150 
Gender                1       21.63       21.63        .65     .4205 
Error              1799    59939.57       33.32 
Corrected Total    1801    59961.67 

In this analysis, the overall model was not significant (p > .01). The r value was .002. The 
variable, gender, was not significant (p > .01), indicating that the scores of the males and 
females did not differ significantly. 

Assumptions Tested 

The assumption of independence of the covariate and treatment was tested for both 
ANCOVAs. Neither test was significant (p > .05), which suggests that 
there was no relationship between gender and the covariates. The assumption of 
homogeneity of regression was also tested, and neither ANCOVA indicated significance (p > .05), 
implying that there were no interactions between the covariates and the variable gender. Thus, 
the essential assumptions for ANCOVA were met. 
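
For reference, the two assumptions can be stated against the usual one-factor ANCOVA model. The notation below is a sketch introduced for illustration and is not taken from the source: Y is the posttest aptitude score, X the pretest covariate, and j indexes gender.

```latex
% ANCOVA model with a common within-group slope
Y_{ij} = \mu + \tau_j + \beta\,(X_{ij} - \bar{X}_{..}) + \varepsilon_{ij}

% Homogeneity of regression: the within-group slopes are equal
H_0:\ \beta_{\text{male}} = \beta_{\text{female}}

% Independence of covariate and treatment: covariate means do not differ by group
H_0:\ \mu_{X,\text{male}} = \mu_{X,\text{female}}
```

The first hypothesis is commonly tested by adding a gender-by-covariate interaction term to the model, and the second by an analysis of variance on the covariate with gender as the factor. A nonsignificant result in each case is consistent with the assumption.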

Tables 4, 5, and 6 are the results from the two-block, five-block, and ten-block 
designs, respectively, that used the pretest aptitude scores as the concomitant variable. Tables 
7, 8, and 9 are the findings from the two-block, five-block, and ten-block designs, 
respectively, which used the pretest attitude scores as the concomitant variable. The blocking 
variable is represented by Blocks. 
Table 4 

Two-Block Design Summary Table Blocking on Aptitude 

Source               DF          SS          MS    F Value    Pr > F 
Model                 2    20423.54    10211.77     464.64     .0001 
Blocks                1    20401.82    20401.82     928.29     .0001 
Gender                1        1.01        1.01        .05     .8306 
Error              1799    39538.13       21.98 
Corrected Total    1801    59961.67 

Table 5 

Five-Block Design Summary Table Blocking on Aptitude 

Source               DF          SS          MS    F Value    Pr > F 
Model                 5    26334.82     5266.96     281.31     .0001 
Blocks                4    26313.10    6578.275     351.34     .0001 
Gender                1       15.40       15.40        .82     .3645 
Error              1796    33626.85       18.72 
Corrected Total    1801    59961.67 

Table 6 

Ten-Block Design Summary Table Blocking on Aptitude 

Source               DF          SS          MS    F Value    Pr > F 
Model                10    27497.91     2749.79     151.70     .0001 
Blocks                9    27476.19     3052.91     168.43     .0001 
Gender                1       20.39       20.39       1.12     .2898 
Error              1791    32463.76       18.13 
Corrected Total    1801    59961.67 

In the previous three block designs, the overall models were significant (p < .01). The 
concomitant variable was significant in all three designs (p < .01). The eta squared values were 
.34 for the two-block design, .44 for the five-block design, and .46 for the 
ten-block design. The variable, gender, was not significant in any of the designs, indicating that 
the scores of males and females did not differ. 
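
The eta squared values for the aptitude block designs can be reproduced from the summary tables under the assumption that eta squared was computed as the Blocks sum of squares divided by the Corrected Total sum of squares. The source does not give the formula, so this is a sketch; the variable names are illustrative.

```python
# Corrected Total SS is the same in all three aptitude block-design tables.
ss_total = 59961.67

# Blocks SS from the two-, five-, and ten-block tables (blocking on aptitude).
ss_blocks = {
    "two-block": 20401.82,
    "five-block": 26313.10,
    "ten-block": 27476.19,
}

# Assumed definition: eta squared = SS_blocks / SS_total.
eta_squared = {design: round(ss / ss_total, 2) for design, ss in ss_blocks.items()}
```

Under this definition the three ratios round to .34, .44, and .46, matching the values reported for the aptitude designs.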

Table 7 

Two-Block Design Summary Table Blocking on Attitude 

Source               DF          SS          MS    F Value    Pr > F 
Model                 2       22.36       11.18        .34     .7149 
Blocks                1         .65         .65        .02     .8893 
Gender                1       21.83       21.83        .66     .4184 
Error              1799    59939.31       33.32 
Corrected Total    1801    59961.67 

Table 8 

Five-Block Design Summary Table Blocking on Attitude 

Source               DF          SS          MS    F Value    Pr > F 
Model                 5      253.15       50.63       1.52     .1794 
Blocks                4      231.43       57.86       1.74     .1385 
Gender                1       20.75       20.75        .62     .4296 
Error              1796    59708.53       33.24 
Corrected Total    1801    59961.67 

Table 9 

Ten-Block Design Summary Table Blocking on Attitude 

Source               DF          SS          MS    F Value    Pr > F 
Model                10      454.40       45.44       1.37     .1893 
Blocks                9      432.68       48.08       1.45     .1626 
Gender                1       21.26       21.26        .64     .4238 
Error              1791    59507.27       33.23 
Corrected Total    1801    59961.67 

In the block designs using attitude, the overall models were not significant (p > .01), and the 
concomitant variables were not significant (p > .01). The eta squared values were .000009 for 
the two-block design, .0036 for the five-block design, and .0064 for the ten-block design. The 
variable, gender, was not significant in any of the designs, indicating that the scores of the 
males and females did not differ. Table 10 summarizes the sample sizes (n), means, and 
standard deviations (SD) of the variables used in the study. 
Table 10 

Descriptive Statistics for Study Variables 

Variable                            n           Mean             SD 
Dependent (Post Aptitude) 
  Male                            830           20.2           5.99 
  Female                          972           20.4    [illegible] 
Covariate One (Pre Aptitude) 
  Male                            830           18.5           5.97 
  Female                          972    [illegible]           5.61 
Covariate Two (Pre Attitude) 
  Male                            830            2.6            .52 
  Female                          972    [illegible]            .49 


Comparison of Analysis Procedures 
To examine the research questions, the p values and computed power values were entered 
into Table 11. 
Table 11 

Comparison of Procedures 

                     Covariate or Blocking Variable 
                     Aptitude               Attitude 
Analysis Method      p        Power         p        Power 
ANCOVA             .3649    > .9999       .4205     < .32 
Two-Block          .8306    > .9999       .4184     < .32 
Five-Block         .3645    > .9999       .4296     < .40 
Ten-Block          .2898    > .9999       .4238     < .40 

To explore the first research question, the p value of the ANCOVA design using aptitude 
was compared to each of the block designs using aptitude, and the p value of the ANCOVA design 
using attitude was compared to each of the block designs using attitude. The ANCOVA design using 
aptitude had a slightly higher p value than the five-block design using aptitude, and a higher p 
value than the ten-block design using aptitude. However, the ANCOVA displayed a lower p value 
than the two-block design using aptitude. When comparing the p values of the ANCOVA design 
using attitude to the p values of the respective blocking designs, the p value differences were not 
noteworthy. 

To address the second research question, the power values were examined. The power 
values of the designs using aptitude were all extremely large because the large 
sample inflated the values for power. The power values of the designs 
using attitude were also similar to one another because of the extremely small effect sizes, 
which can be attributed to the small overall differences between the sexes. 

To answer the third and fourth research questions, the p values and power values of each 
blocking design using the same blocking variable were compared to one another. The five-block 
and ten-block designs using aptitude had lower p values than the two-block design using 
aptitude. The ten-block design had a lower p value than the five-block design. Therefore, the 
larger the number of blocks, the lower the p value of the blocking designs using aptitude. The p 
values of the block designs using attitude did not differ by more than .01. As before, the power 
values did not differ for the designs using aptitude due to the large sample size, or for the 
designs using attitude due to the small differences in gender. 

Research questions five and six compared the p and power differences of the ANCOVA 
design using aptitude as the covariate to the ANCOVA design using attitude as the covariate. The 
p value of the ANCOVA design using aptitude was lower than the p value of the ANCOVA design 
using attitude. In addition, the power value of the ANCOVA design using aptitude was much 
greater than the power value of the ANCOVA design using attitude. 
Answering the last two research questions involved a comparison of the block designs 
using aptitude as the concomitant variable to the block designs using attitude. In comparing the 
two-block designs, the design using aptitude indicated a higher p value than the two-block 
design using attitude. This was the only instance in which a design using aptitude had a higher p 
value than a design using attitude, and a power comparison of these two designs did not favor 
attitude. The power of the two-block design using aptitude was much greater than the power of 
the two-block design using attitude. 

The five-block design using aptitude had a lower p value than the five-block design using 
attitude. The p value of the ten-block design using attitude (.4238) was nearly one and a half 
times that of the ten-block design using aptitude (.2898). Both the five-block and ten-block designs using aptitude 
indicated much higher power values than the five-block and ten-block designs using attitude. 
DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS 
Comparison of Using Aptitude versus Attitude 

There was evidence to indicate that the pretest scores of the aptitude scale were better 
covariate and concomitant variables than the pretest scores from the attitude scale with the 
Integrated Science 7 (IS7) data. The overall models of the designs using aptitude were 
significant, which implies that the models were useful. However, the overall models of the 
designs using attitude were not significant, demonstrating that the models were not useful. The 
covariate was significant in all the designs using aptitude, which indicated that the pretest 
aptitude scores were an effective predictor of the aptitude posttest scores with the ANCOVA and 
useful as a blocking factor for the blocking designs. The covariate was not significant in any of 
the designs using attitude, suggesting that the pretest attitude scores were not an effective 
predictor of the aptitude posttest scores with the ANCOVA and were not useful as a blocking factor 
with the blocking designs. The designs using the aptitude scores all indicated much greater 
power values than the designs using the attitude scores, and in all but one comparison (the 
two-block) the p values for the designs using aptitude were less than those using attitude. 
Lastly, the r value of the ANCOVA design using aptitude was much greater than the r value of the 
ANCOVA design using attitude. 

Comparison of Designs Using Aptitude 
Given that the designs using aptitude were superior, a comparison of these designs was 
considered. The power values of the four designs using aptitude were all greater than .9999; 
therefore, distinctions could not be made using power. However, comparisons of p values could 
be made. 

Comparison of Block Designs Using Aptitude 
A comparison of the p values of the block designs using aptitude was considered to 
determine the optimal number of blocks among the blocking designs. The p values for the 
five- and ten-block designs were less than the p value for the two-block design. In comparing 
the ten-block design to the five-block design, the ten-block design had the lower p value. 
Therefore, among the blocking designs considered, the optimal blocking number with the IS7 
data appears to be 10. 

In comparing the p value of the ANCOVA design using aptitude to the block designs using 
aptitude, the ANCOVA indicated a lower p value than did the two-block design. There was not a 
noteworthy difference between the p value of the ANCOVA and the p value of the five-block design 
in that the values only differed by .0004. However, the five-block design holds the advantage 
because it maintained a comparably low p value despite losing more 
degrees of freedom than the ANCOVA. 

The ten-block design indicated a lower p value than the ANCOVA, with the difference 
between the two p values being .0751. The lower p value was accomplished by the ten-block 
design even though nine degrees of freedom were lost. Therefore, with the IS7 data, it appears 
more advantageous to use the ten-block design in comparison to ANCOVA. 
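
The degrees-of-freedom cost of each design can be tallied directly. The following sketch assumes the layout shown in the summary tables: one degree of freedom for gender, one for the covariate in ANCOVA, and one fewer than the number of blocks for the blocking factor. The function name `error_df` is hypothetical.

```python
def error_df(n_subjects: int, n_blocks: int = 0) -> int:
    """Error degrees of freedom for the gender-by-adjustment designs.

    n_blocks = 0 means ANCOVA (one df for the covariate); otherwise the
    blocking factor uses n_blocks - 1 df. Gender always uses one df.
    """
    adjustment_df = 1 if n_blocks == 0 else n_blocks - 1
    gender_df = 1
    return (n_subjects - 1) - adjustment_df - gender_df


N = 1802  # sample size reported in the study
designs = {"ANCOVA": 0, "two-block": 2, "five-block": 5, "ten-block": 10}
error_dfs = {name: error_df(N, b) for name, b in designs.items()}
```

With a sample this large, the error degrees of freedom stay close to 1,800 in every design, so the cost of additional blocks is small relative to the error term.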
The analysis that appeared to be the most useful with the IS7 data was the 
ten-block design using aptitude as the concomitant variable, but to generalize this to all data 
would be in error. Greenberg (1953) and Gourlay (1953) indicated a preference 
for ANCOVA, and Keppel (1973) revealed a bias toward blocking, but there does not appear 
to be one optimal method. More evidence is available to suggest that the use of ANCOVA versus 
blocking is dependent upon several conditions (Maxwell, Delaney, & Dill, 1984; Wu, 1993). 
Cox (1957) and Feldt (1958) suggested that a determining factor should be the correlation 
between the dependent and concomitant variables. Wu (1993) found similar results, suggesting 
that a block design is the better choice when the correlation coefficient is low, and an ANCOVA is 
favored when the correlation is high. However, Wu also found ANOVA to be the preferred 
method when no correlation exists. Wu added that if a moderate correlation exists, block designs 
should be selected when there are large numbers of treatments and subjects per treatment. The 
present study supported Wu's findings, since the ten-block design using aptitude was the preferred 
method and this design contained large numbers of subjects per treatment and a moderate 
correlation. However, Cox stated that a block design is preferred if the correlation is less than 
.4; Feldt said neither design is preferred if the correlation is less than .2; and Wu said ANOVA is 
the better design if there is no correlation. The correlation between the covariate and the 
dependent variable was .02 when attitude was used as the covariate. None of the designs 
indicated an advantage, a result supportive of Feldt and Wu but not Cox. 
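
The cut points quoted above can be collected into a rough decision sketch. This is an illustration only, not a rule proposed by any of the cited authors: the function name `suggest_design` is hypothetical, the source gives no cut point for a "high" correlation (so everything at or above .4 is treated as high here by assumption), and Wu's additional conditions on the number of treatments and subjects are ignored. Below .2 the sketch follows Feldt and Wu, not Cox.

```python
def suggest_design(r: float) -> str:
    """Map a covariate-outcome correlation to a design suggestion.

    Cut points .2 and .4 are the ones attributed to Feldt and Cox in the
    text. Treating everything at or above .4 as 'high' is an assumption
    made for illustration; the text gives no cut point for a high
    correlation.
    """
    r = abs(r)
    if r < 0.2:
        # Feldt: neither adjustment is preferred; Wu: ANOVA with no correlation.
        return "ANOVA (no advantage from blocking or ANCOVA)"
    if r < 0.4:
        # Cox: block design preferred below .4.
        return "block design"
    return "ANCOVA"


attitude_case = suggest_design(0.02)  # correlation reported for attitude
```

For the attitude covariate the sketch returns the ANOVA suggestion, in line with the observation that none of the attitude designs showed an advantage. It would suggest ANCOVA for the aptitude covariate, whereas the present data favored the ten-block design, which illustrates why a single correlation threshold is not sufficient.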

Maxwell, Delaney, and Dill (1984) also offered guidance in choosing between ANCOVA 
and blocking. They found that if the scores on the concomitant variable are available before 
subject assignment to treatments, then the block design is more powerful. Maxwell et al. also 
found that if the relationship between the concomitant variable and dependent variable is linear, 
ANCOVA is the more powerful. If the relationship is not linear, Maxwell et al. suggested a 
two-way ANOVA or a generalized regression analysis. This coincides with McLean (1979), who 
stated that when the dependent and concomitant variables are not related, ANCOVA is no more 
powerful than ANOVA and may be less powerful due to the loss of degrees of freedom. 

Discussion of Assumption Violations 
The two assumptions tested in the ANCOVA designs were homogeneity of regression 
and the independence of the independent variable and the covariate. Testing of these 
assumptions gave no reason to believe that they were not met in the present study (p > .05). 
However, since random sampling was not part of the design of the IS7 pilot, the assumption of 
random sampling was not met in the present study. The effect of this on the ANCOVA analyses is 
not known. Regarding other assumptions, McLean (1979) found ANCOVA to be robust against most 
violations of normality. He also stated that the assumption of homogeneity of variance 
"requires about the same amount of concern in ANCOVA as it does in the analysis of variance" 
(p. 7). Moreover, McLean pointed out that the assumption of perfect reliability is unattainable 
in the social sciences and is of less importance when the assumption of independence of 
treatment and covariate is met. 

Concerning block designs, Kennedy and Bush (1985) reported that block designs appear 
to be robust against most violations of homogeneity of variance. The block design also does not 
require that the relationship between the concomitant variable and dependent variable be linear 
(Keppel, 1973) or that a high correlation exist between the dependent variable and concomitant 
variable (Cox, 1957; Feldt, 1958). 

Suggestions for Further Research 

Since the IS7 data meet the assumptions of homogeneity of regression and independence of 
covariate and independent variables, researchers could look at the effects when assumptions are 
not met to determine if blocking would be a better choice. The assumption of random sampling is 
not met in this study and could have impacted the results. Further research could examine data 
obtained through random sampling to determine its impact. The data also did not indicate a 
significant difference between the genders. A gender difference could be forced by a researcher, 
and then a comparison made between ANCOVA and blocking. 

Arguments were presented that the ANCOVA using aptitude was the best choice for the IS7 
data when compared to the other designs. However, none of the designs, including the ANCOVA, 
indicated a very high r value. 

The present study indicated power values of greater than .99 for the models using 
aptitude as the concomitant or covariate. The inflation of the power values is due to the large 
sample size. Other research could examine ANCOVA and block designs using smaller sample 
sizes to obtain lower power values. 

The present study indicated that the ten-block design using the pretest aptitude scores as 
the concomitant variable had the lowest p value. It would be of interest to determine if 
increasing the number of blocks beyond ten would have continued to produce lower values 
of p. Further research could examine increasing the number of blocks until each subject's 
score became its own block to assist in determining the optimal number of blocks. 

The results of this study are limited by the IS7 pilot data but give insight into the uses of 
ANCOVA versus blocking designs. These results can be used to support past research and spur 
the interest of future researchers. As Wu (1993) stated in his work, "The greatest 
contribution of this study might not be the specific results reported, but the potential for 
examining other situations" (p. 29). 